A Fast Learning Agent Based on the Dyna Architecture
Authors
Abstract
In this paper, we present a rapid learning algorithm called Dyna-QPC. The proposed algorithm requires considerably less training time than the Q-learning and table-based Dyna-Q algorithms, making it applicable to real-world control tasks. Dyna-QPC combines three existing learning techniques: CMAC, Q-learning, and prioritized sweeping. In a practical experiment, the Dyna-QPC algorithm is implemented with the goal of minimizing the learning time required for a robot to navigate a discrete state space containing obstacles. The robot learning agent uses Q-learning for policy learning and a CMAC model as an approximator of the system environment. The prioritized sweeping technique manages a queue of previously influential state-action pairs, which are replayed by a planning function. The planning function is implemented as a background task that updates the learning policy from the experience stored in the approximation model; because the background task runs during CPU idle time, it places no additional load on the system processor. The Dyna-QPC agent switches seamlessly between real and virtual modes with the objective of achieving rapid policy learning. A simulated and an experimental scenario have been designed and implemented: the simulated scenario tests the speed and efficiency of the three learning algorithms, while the experimental scenario evaluates the new Dyna-QPC agent. Results from both scenarios demonstrate the superior performance of the proposed learning agent.
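The abstract describes the agent's learning loop only in prose; the Python sketch below shows how the pieces fit together: a direct Q-learning update from each real transition, a learned transition model, and a prioritized-sweeping queue replayed by the planning step. This is a minimal sketch under simplifying assumptions, not the paper's implementation: transitions are assumed deterministic, a plain table stands in for the paper's CMAC model, and all identifiers and constants (learn_step, ALPHA, THETA, planning_steps, and so on) are illustrative.

```python
import heapq
from collections import defaultdict

# Sketch of a Dyna-style loop with prioritized sweeping (hypothetical names).
ALPHA, GAMMA, THETA = 0.1, 0.95, 1e-4   # step size, discount, priority threshold

Q = defaultdict(float)                   # Q[(s, a)] -> action value
model = {}                               # model[(s, a)] -> (reward, next_state)
predecessors = defaultdict(set)          # next_state -> {(s, a) pairs leading to it}
pqueue = []                              # priority queue (heapq min-heap, negated priorities)

def best_q(s, actions):
    return max(Q[(s, a)] for a in actions)

def learn_step(s, a, r, s2, actions, planning_steps=10):
    # 1. Direct Q-learning update from the real transition.
    td_error = r + GAMMA * best_q(s2, actions) - Q[(s, a)]
    Q[(s, a)] += ALPHA * td_error

    # 2. Record the transition in the (deterministic) model.
    model[(s, a)] = (r, s2)
    predecessors[s2].add((s, a))

    # 3. Queue this pair if its update was influential enough.
    if abs(td_error) > THETA:
        heapq.heappush(pqueue, (-abs(td_error), (s, a)))

    # 4. Planning: replay the most urgent remembered pairs against the model.
    for _ in range(planning_steps):
        if not pqueue:
            break
        _, (ps, pa) = heapq.heappop(pqueue)
        pr, ps2 = model[(ps, pa)]
        Q[(ps, pa)] += ALPHA * (pr + GAMMA * best_q(ps2, actions) - Q[(ps, pa)])
        # Propagate the change backwards to predecessors of ps.
        for (qs, qa) in predecessors[ps]:
            qr, _ = model[(qs, qa)]
            priority = abs(qr + GAMMA * best_q(ps, actions) - Q[(qs, qa)])
            if priority > THETA:
                heapq.heappush(pqueue, (-priority, (qs, qa)))
```

In the paper, the planning half of this loop is scheduled as a background task, so step 4 would run during CPU idle time rather than inline with the real interaction as it does in this sketch.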
Similar Resources
An Architectural Framework for Integrated Multiagent Planning, Reacting, and Learning
Dyna is a single-agent architectural framework that integrates learning, planning, and reacting. Well-known instantiations of Dyna are Dyna-AC and Dyna-Q. Here a multiagent extension of Dyna-Q is presented. This extension, called M-Dyna-Q, constitutes a novel coordination framework that bridges the gap between plan-based and reactive coordination in multiagent systems. The paper summarizes the ...
Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming
This paper extends previous work with Dyna, a class of architectures for intelligent systems based on approximating dynamic programming methods. Dyna architectures integrate trial-and-error (reinforcement) learning and execution-time planning into a single process operating alternately on the world and on a learned model of the world. In this paper, I present and show results for two Dyna archi...
A Multiagent Variant of Dyna-Q
This paper describes a multiagent variant of Dyna-Q called M-Dyna-Q. Dyna-Q is an integrated single-agent framework for planning, reacting, and learning. Like Dyna-Q, M-Dyna-Q employs two key ideas: learning results can serve as a valuable input for both planning and reacting, and results of planning and reacting can serve as a valuable input to learning. M-Dyna-Q extends Dyna-Q in that planning...
Wp-dyna: Planning and Reinforcement Learning in Well-plannable Environments
Reinforcement learning (RL) involves sequential decision making in uncertain environments. The aim of the decision-making agent is to maximize the benefit of acting in its environment over an extended period of time. Finding an optimal policy in RL may be very slow. To speed up learning, one often-used solution is the integration of planning, for example, Sutton's Dyna algorithm, or various oth...
Journal:
J. Inf. Sci. Eng.
Volume 30, Issue -
Pages -
Publication date: 2014